43 research outputs found

    Open the box of digital neuromorphic processor: Towards effective algorithm-hardware co-design

    Sparse and event-driven spiking neural network (SNN) algorithms are an ideal candidate solution for energy-efficient edge computing. Yet, with the growing complexity of SNN algorithms, it is difficult to benchmark and optimize their computational cost without hardware in the loop. Although digital neuromorphic processors have been widely adopted to benchmark SNN algorithms, their black-box nature is problematic for algorithm-hardware co-optimization. In this work, we open the black box of the digital neuromorphic processor for algorithm designers by presenting the neuron processing instruction set and detailed energy consumption of the SENeCA neuromorphic architecture. For convenient benchmarking and optimization, we provide the energy cost of the essential neuromorphic components in SENeCA, including neuron models and learning rules. Moreover, we exploit SENeCA's hierarchical memory and show an advantage over existing neuromorphic processors. We show the energy efficiency of SNN algorithms for video processing and online learning, and demonstrate the potential of our work for optimizing algorithm designs. Overall, we present a practical approach that enables algorithm designers to accurately benchmark SNN algorithms and paves the way towards effective algorithm-hardware co-design.

    A 36 µW 1.1 mm² reconfigurable analog front-end for cardiovascular and respiratory signals recording

    This paper presents a 1.2 V, 36 µW reconfigurable analog front-end (R-AFE) as a general-purpose low-cost IC for multiple-mode biomedical signal acquisition. The R-AFE efficiently reuses a reconfigurable preamplifier, a current generator (CG), and a mixed-signal processing unit, occupying an area of 1.1 mm² per R-AFE while supporting five acquisition modes to record different forms of cardiovascular and respiratory signals. The R-AFE can interface with voltage, current, impedance, and light sensors and hence can measure electrocardiography (ECG), bio-impedance (BioZ), photoplethysmogram (PPG), galvanic skin response (GSR), and general-purpose analog signals. Thanks to the chopper preamplifier and the low-noise CG utilizing dynamic element matching, the R-AFE mitigates 1/f noise from both the preamplifier and the CG for improved measurement sensitivity. The IC achieves competitive performance compared to state-of-the-art dedicated readout ICs for ECG, BioZ, GSR, and PPG, but with approximately 1.4× to 5.3× smaller chip area per channel. © 2018 IEEE. Peer reviewed. Postprint (author's final draft).

    A structured and scalable test access architecture for TSV-based 3D stacked ICs

    New process technology developments enable the creation of three-dimensional stacked ICs (3D-SICs) interconnected by means of Through-Silicon Vias (TSVs). This paper presents a DfT test access architecture for such 3D-SICs that allows for both pre-bond die testing and post-bond stack testing. The DfT architecture is based on a modular test approach, in which the various dies, their embedded IP cores, the inter-die TSV-based interconnects, and the external I/Os can be tested as separate units to allow optimization of the 3D-SIC test flow. The architecture builds on and reuses existing DfT hardware at the core, die, and product level. It adds a die-level wrapper, which is based on IEEE 1500, with the following novel features: (1) dedicated probe pads on the non-bottom dies to facilitate pre-bond die testing, (2) TestElevators that transport test control and data signals up and down during post-bond stack testing, and (3) a hierarchical Wrapper Instruction Register (WIR) chain. The paper also hints at opportunities for optimization and standardization of this architecture.

    3D DfT architecture for pre-bond and post-bond testing

    Process technology developments enable the creation of three-dimensional stacked ICs (3D-SICs) interconnected by means of Through-Silicon Vias (TSVs). This paper presents a 3D Design-for-Test (DfT) architecture for such 3D-SICs that allows pre-bond die testing as well as post-bond stack testing of both partial and complete stacks. The architecture enables a modular test approach, in which the various dies, their embedded IP cores, the inter-die TSV-based interconnects, and the external I/Os can be tested as separate units to allow flexible optimization of the 3D-SIC test flow. The architecture builds on and reuses existing DfT hardware at the core, die, and product level. Its main new component is a die-level wrapper, which can be based on either IEEE Std 1500 or IEEE Std 1149.1. The paper presents a conceptual overview of the architecture, as well as implementation aspects. Experimental results show that the implementation costs are negligible for medium to large dies.

    VWR2A: A Very-Wide-Register Reconfigurable-Array Architecture for Low-Power Embedded Devices

    Edge computing requires high-performance, energy-efficient embedded systems. Fixed-function or custom accelerators, such as FFT or FIR filter engines, are very efficient at implementing a particular functionality for a given set of constraints. However, they are inflexible when facing application-wide optimizations or functionality upgrades. Conversely, programmable cores offer higher flexibility, but often with a penalty in area, performance, and, above all, energy consumption. In this paper, we propose VWR2A, an architecture that integrates high computational density and low-power memory structures (i.e., very-wide registers and scratchpad memories). VWR2A narrows the energy gap with respect to an FFT accelerator while delivering similar or better performance on FFT kernels. Moreover, VWR2A's flexibility allows it to accelerate multiple kernels, resulting in significant energy savings at the application level.

    A 119dB Dynamic Range Charge Counting Light-to-Digital Converter For Wearable PPG/NIRS Monitoring Applications

    This paper presents a low-power, high dynamic range (DR), reconfigurable light-to-digital converter (LDC) for photoplethysmogram (PPG) and near-infrared spectroscopy (NIRS) sensor readouts. The proposed LDC utilizes current integration and a charge counting operation to directly convert the photocurrent to a digital code, reducing the noise contributors in the system. The LDC consists of a latched comparator, a low-noise current reference, a counter, and a multi-function integrator, which is used in both signal amplification and charge-counting-based data quantization. Furthermore, a current DAC is used to further increase the DR by canceling the baseline current. The LDC, together with LED drivers and auxiliary digital circuitry, is implemented in a standard 0.18 μm CMOS process and characterized experimentally. The LDC and LED drivers consume a total power of 196 μW while achieving a maximum DR of 119 dB. The charge counting clock and the pulse repetition frequency of the LED driver can be reconfigured, providing a wide range of power-resolution trade-offs. At a minimum power consumption of 87 μW, the LDC still achieves 95 dB DR. The LDC is also validated with on-body PPG and NIRS measurements using a photodiode (PD) and a silicon photomultiplier (SiPM), respectively.

    A configurable and low-power mixed signal SoC for portable ECG monitoring applications

    This paper describes a mixed-signal ECG System-on-Chip (SoC) that is capable of implementing configurable functionality with low power consumption for portable ECG monitoring applications. A low-voltage, high-performance analog front-end extracts 3-channel ECG signals and a single-channel electrode-tissue-impedance (ETI) measurement with high signal quality. The ETI measurement can be used to evaluate the quality of the ECG measurement and to filter motion artifacts. A custom digital signal processor consisting of a 4-way SIMD processor provides configurability and advanced functionality such as motion artifact removal and R-peak detection. A built-in 12-bit analog-to-digital converter (ADC) is capable of adaptive sampling, achieving a compression ratio of up to 7, and loop buffer integration reduces the power consumption of on-chip memory access. The SoC is implemented in a 0.18 μm CMOS process and consumes 32 μW from a 1.2 V supply while a heartbeat detection application is running. It is integrated in a wireless ECG monitoring system using the Bluetooth protocol. Thanks to the ECG SoC, the overall system power consumption can be reduced significantly.